Takagi Factorization on GPU using CUDA

Authors

  • Gagandeep S. Sachdev
  • Mary W. Hall
Abstract

Takagi factorization or symmetric singular value decomposition is a special form of SVD applicable to symmetric complex matrices. The computation takes advantage of symmetry to reduce computation and storage requirements. The Jacobi method with chess tournament ordering was used to perform the computation in parallel on a GPU using the CUDA programming model. We were able to achieve speedups of over 11x and 7x over CPU serial and Pthreads implementations, respectively, for matrix sizes greater than 512 × 512.
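The abstract does not spell out the chess tournament ordering, so the following is only a minimal sketch of the standard round-robin (circle) schedule that the name usually refers to, not the authors' implementation. The function name `chess_tournament_rounds` and the 0-based indexing are assumptions made for illustration. For an even number of indices n, it produces n - 1 rounds of n/2 disjoint pairs, so that every pair (p, q) appears exactly once per sweep.

```python
def chess_tournament_rounds(n):
    """Yield n - 1 rounds, each a list of n // 2 disjoint index pairs.

    Hypothetical helper: standard circle-method round-robin schedule.
    n must be even (pad with a dummy index otherwise).
    """
    if n < 2 or n % 2 != 0:
        raise ValueError("n must be an even integer >= 2")
    players = list(range(n))
    for _ in range(n - 1):
        # Pair each entry with its mirror image: (first, last), (second, second-to-last), ...
        yield [
            tuple(sorted((players[i], players[n - 1 - i])))
            for i in range(n // 2)
        ]
        # Keep players[0] fixed and rotate the remaining entries by one position.
        players = [players[0], players[-1]] + players[1:-1]


if __name__ == "__main__":
    # Print the schedule for a small case to inspect the pairing.
    for round_number, pairs in enumerate(chess_tournament_rounds(4), start=1):
        print(round_number, pairs)
```

Because the pairs within one round share no index, the Jacobi rotations they define touch disjoint rows and columns, which is what makes it plausible to apply them concurrently, for example one pair per GPU thread block. How the paper actually maps pairs to threads is not stated in the abstract.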

Similar resources

Accelerating high-order WENO schemes using two heterogeneous GPUs

A double-GPU code is developed to accelerate WENO schemes. The test problem is a compressible viscous flow. The convective terms are discretized using third- to ninth-order WENO schemes and the viscous terms are discretized by the standard fourth-order central scheme. The code written in CUDA programming language is developed by modifying a single-GPU code. The OpenMP library is used for parall...

An approach to Improve Particle Swarm Optimization Algorithm Using CUDA

The time consumption in solving computationally heavy problems has always been a concern for computer programmers. Due to simplicity of its implementation, the PSO (Particle Swarm Optimization) is a suitable meta-heuristic algorithm for solving computationally heavy problems. However, despite the simplicity, the algorithm is inefficient for solving real computationally heavy problems but the pr...

Parallelization of Rich Models for Steganalysis of Digital Images using a CUDA-based Approach

There are several different methods to make an efficient strategy for steganalysis of digital images. A very powerful method in this area is the rich model, consisting of a large number of diverse sub-models in both the spatial and transform domains that should be utilized. However, the extraction of various types of features from an image is very time consuming in some steps, especially for training pha...

GPU-Accelerated Parallel Sparse LU Factorization Method for Fast Circuit Analysis

Lower upper (LU) factorization for sparse matrices is the most important computing step for circuit simulation problems. However, parallelizing LU factorization on the graphic processing units (GPUs) turns out to be a difficult problem due to intrinsic data dependence and irregular memory access, which diminish GPU computing power. In this paper, we propose a new sparse LU solver on GPUs for ci...

Non-negative Matrix Factorization on GPU

Today, the need to process large data collections is increasing. Such data can have very large dimensions and hidden relationships. Analyzing this type of data leads to many errors and noise; therefore, dimension reduction techniques are applied. Many reduction techniques have been developed, e.g. SVD, SDD, PCA, ICA and NMF. Non-negative matrix factorization (NMF) has main advantage in process...

Publication date: 2010